How to Deal with Large Dataset, Class Imbalance and Binary Output in SVM based Response Model
Authors
Abstract
The Support Vector Machine (SVM) employs the Structural Risk Minimization (SRM) principle to generalize better than conventional machine learning methods that employ the traditional Empirical Risk Minimization (ERM) principle. When applying SVM to response modeling in direct marketing, however, one has to deal with three practical difficulties: large training data, class imbalance, and binary SVM output. This paper proposes ways to alleviate or solve these difficulties through informative sampling, the use of different costs for different classes, and the use of the distance to the decision boundary. This paper also provides various evaluation measures for response models in terms of accuracy, lift chart analysis, and computational efficiency.
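The following is a minimal sketch, not the paper's implementation, of two ideas named in the abstract: class-dependent misclassification costs and ranking customers by a continuous score rather than a binary label. It assumes labels are +1 (responder) and -1 (non-responder) and that `scores` holds signed distances to an already trained decision boundary. The function names `weighted_hinge_loss` and `lift_at` are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def weighted_hinge_loss(
    scores: Sequence[float],
    labels: Sequence[int],
    cost_pos: float,
    cost_neg: float,
) -> float:
    """Hinge loss with a separate cost per class (labels are +1 or -1).

    A larger cost_pos penalizes errors on the rare responder class more.
    """
    total = 0.0
    for s, y in zip(scores, labels):
        cost = cost_pos if y == 1 else cost_neg
        total += cost * max(0.0, 1.0 - y * s)
    return total


def lift_at(
    scores: Sequence[float],
    labels: Sequence[int],
    fraction: float,
) -> float:
    """Lift of the top `fraction` of customers ranked by score.

    Lift = response rate among the top-ranked customers divided by
    the response rate over the whole list.
    """
    n = len(scores)
    n_responders = sum(1 for y in labels if y == 1)
    if n == 0 or n_responders == 0:
        raise ValueError("need at least one customer and one responder")
    n_top = max(1, round(fraction * n))
    # Rank by signed distance to the boundary, most positive first.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    top_hits = sum(1 for _, y in ranked[:n_top] if y == 1)
    return (top_hits / n_top) / (n_responders / n)
```

Using the signed distance as a score lets a marketer mail only the top-ranked fraction of customers, which a plain +1/-1 output cannot support.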
Similar Resources
Research on Credit Card Fraud Detection Model Based on Class Weighted Support Vector Machine
To deal with credit card fraud, a detection model based on a Class Weighted Support Vector Machine was established. Because the data are large-scale and high-dimensional, Principal Component Analysis (PCA) was first applied to screen out the main factors from a large number of indicative attributes, effectively reducing the training dimension of the SVM. Then according to the characteristics of cre...
Extracting Predictor Variables to Construct Breast Cancer Survivability Model with Class Imbalance Problem
Applying data mining methods as a decision support system is of great benefit in predicting the survival of new patients. It also has great potential for health researchers investigating the relationship between risk factors and cancer survival. But due to the imbalanced nature of datasets associated with breast cancer survival, the accuracy of survival prognosis models is a challenging issue...
Vehicle Make and Model Recognition Using a Set of Discriminative Parts
In fine-grained recognition, the main category of the object is known and the goal is to determine the subcategory or fine-grained category. Vehicle make and model recognition (VMMR) is a fine-grained classification problem. It poses several challenges, such as a large number of classes, substantial intra-class distance, and small inter-class distance. VMMR can be utilized when license plate numbers ca...
ADABOOST ENSEMBLE ALGORITHMS FOR BREAST CANCER CLASSIFICATION
With advances in technology, different tumor features have been collected for Breast Cancer (BC) diagnosis. Processing such large data sets poses challenges, including high storage requirements and the time required for accessing and processing the data. The objective of this paper is to classify BC based on the extracted tumor features. To extract useful information and diagnose the tumo...
The Use of the Binary Bat Algorithm in Improving the Accuracy of Breast Cancer Diagnosis
Introduction: The early diagnosis of breast cancer, a prevalent cancer among women, is a necessity in cancer research since it could simplify the clinical management of other patients. The importance of classifying breast cancer patients into high- or low-risk groups has led research groups in biomedical and informatics departments to evaluate and use computer techniques s...